Abstract

Background: Tri- and tetra-nucleotide repeats in mammalian genomes can induce formation of alternative non-B 
DNA structures such as triplexes and guanine (G)-quadruplexes. These structures can induce mutagenesis, 
chromosomal translocations and genomic instability. We wanted to determine if proteins that bind triplex DNA 
structures are quantitatively or qualitatively different between colorectal tumor and adjacent normal tissue and if 
this binding activity correlates with patient clinical characteristics. 

Methods: Extracts from 63 human colorectal tumor and adjacent normal tissues were examined by gel shifts 
(EMSA) for triplex DNA-binding proteins, which were correlated with clinicopathological tumor characteristics using 
the Mann-Whitney U, Spearman's rho, Kaplan-Meier and Mantel-Cox log-rank tests. Biotinylated triplex DNA and 
streptavidin agarose affinity binding were used to purify triplex-binding proteins in RKO cells. Western blotting and 
reverse-phase protein array were used to measure protein expression in tissue extracts. 

Results: Increased triplex DNA-binding activity in tumor extracts correlated significantly with lymphatic disease, 
metastasis, and reduced overall survival. We identified three multifunctional splicing factors with biotinylated triplex 
DNA affinity: U2AF65 in cytoplasmic extracts, and PSF and p54nrb in nuclear extracts. Super-shift EMSA with 
anti-U2AF65 antibodies produced a shifted band of the major EMSA H3 complex, identifying U2AF65 as the protein 
present in the major EMSA band. U2AF65 expression correlated significantly with EMSA H3 values in all extracts and 
was higher in extracts from Stage III/IV vs. Stage I/II colon tumors (p = 0.024). EMSA H3 values and U2AF65 
expression also correlated significantly with GSK3 beta, beta-catenin, and NF-κB p65 expression, whereas p54nrb 
and PSF expression correlated with c-Myc, cyclin D1, and CDK4. EMSA values and expression of all three splicing 
factors correlated with ErbB1, mTOR, PTEN, and Stat5. Western blots confirmed that full-length and truncated 
beta-catenin expression correlated with U2AF65 expression in tumor extracts. 

Conclusions: Increased triplex DNA-binding activity in vitro correlates with lymph node disease, metastasis, and 
reduced overall survival in colorectal cancer, and increased U2AF65 expression is associated with total and 
truncated beta-catenin expression in high-stage colorectal tumors. 
Background

DNA and RNA are dynamic molecules that adopt 
several different secondary and tertiary structures. DNA 
can form a stable triple helix in which a purine- 
or pyrimidine-rich third strand forms sequence-specific 
H-bonds (Hoogsteen and reverse-Hoogsteen) with a 
purine-rich strand in the major groove of the Watson- 
Crick duplex in polypyrimidine-polypurine repeat 
sequences [1]. Guanine (G)-rich DNA and RNA can also 
form G-quadruplexes that also use Hoogsteen and re- 
verse Hoogsteen G*G bonds in a non-canonical four- 
stranded topology. G-quadruplexes specifically have 
been implicated at DNA telomere ends, the purine-rich 
DNA strands of oncogenic promoters, and in RNA 
5'-untranslated regions (UTR) near translation start 
sites [2]. For example, a nuclease-sensitive element in 
the human c-MYC promoter that can form either a 
DNA triplex or G-quadruplex interferes with DNA tran- 
scription [3]. Transient Hoogsteen base pairs have been 
detected in DNA duplexes bound to transcription fac- 
tors and in damaged DNA, suggesting that the DNA 
double helix can resonate and form excited-state 
Hoogsteen base pairs that can expand its structural complexity 
[4]. 

Genomic instability in association with carcinogenesis 
is well established and promotes multiple hallmarks of 
cancer [5]. Repetitive DNA, such as tri- and tetranucleo- 
tide sequences, is genetically unstable, and expansions of 
such DNA repeats are associated with numerous heredi- 
tary neurological diseases including Fragile X syndrome, 
myotonic dystrophy, and Friedreich's ataxia [6,7]. Many 
of these DNA repeat sequences can exist in at least two 
different conformations, and at least 10 non-B DNA 
conformations can form, perhaps transiently, at specific 
sequences due to negative supercoiling generated by DNA 
replication, transcription, protein binding, or during DNA 
repair [8]. Non-B DNA structures such as cruciforms, tri- 
plexes and G-quadruplexes can cause mutations such as 
deletions, expansions, and translocations [9,10]. Bacolla 
et al. found that genes containing long polypyrimidine- 
polypurine sequences are more susceptible to chromo- 
somal translocations than genes that do not contain these 
sequences [11]. Researchers have located "hotspot" regions 
of the genome at or near sequences with the potential to 
form non-B DNA structures, including the region in the 
promoter of the human c-MYC gene capable of forming 
triplex or G-quadruplex DNA that overlaps with one of 
the major breakpoint hotspots in c-MYC-induced 
lymphomas and leukemias [12,13]. The recently created Non-B 
Database (http://nonb.abcc.ncifcrf.gov) can be used to pre- 
dict the capability of a DNA sequence in mammalian gen- 
omes to form any of a variety of non-B structures [14]. 

While the existence of triplex or G-quadruplex nucleic 
acids in vivo has yet to achieve mainstream acceptance, 
eukaryotic proteins that recognize and bind to these 
alternative structures do exist. For example, the Fragile X 
mental retardation protein (FMRP) binds an intramolecular 
G-quartet in target mRNAs, and loss of function of this 
protein causes the Fragile X mental retardation syndrome 
[15]. We have studied proteins in Saccharomyces cerevi- 
siae and HeLa carcinoma cells that bind specifically to a 
purine-motif triplex DNA probe in gel shifts (EMSA) 
where the third strand is G-rich and photo-crosslinked 
with a psoralen group (Ps~) [16-18]. Stm1, the major 
purine-motif triplex DNA-binding protein in S. cerevisiae, 
also binds to G-quartet DNA and RNA in vitro [19]. Using 
Southwestern blotting where HeLa nuclear extracts were 
separated by SDS-PAGE, blotted and probed with the 
same radio-labeled purine triplex DNA used in EMSA, we 
found that 100-, 60-, and 15-kDa bands were hybridized 
with the triplex DNA probe, whereas only the 100-kDa 
band was also hybridized with the parent duplex DNA 
probe [16]. RecQ-family helicases, including the WRN 
helicase, have been shown to preferentially bind to and 
unwind aberrant DNA structures such as triplex and 
G-quadruplex DNAs, which are believed to exist in vivo as 
intermediates in DNA replication, recombination, and 
repair. The WRN helicase is deficient in patients with 
Werner syndrome, an autosomal recessive disease causing 
premature aging that is associated with numerous age- 
related phenotypes, including a high predisposition to can- 
cer [20]. Others have examined specific aspects of WRN 
expression in colorectal cancer, such as the presence of 
allelic variants and colorectal cancer risk and WRN pro- 
moter methylation as it correlates with a CpG island 
methylation phenotype (CIMP)-high diagnosis [21,22]. 
These studies led us to ask whether triplex DNA-binding 
proteins and WRN helicase expression differ quantitatively 
and/or qualitatively between human colorectal tumors and 
corresponding normal tissues, and whether any such difference 
correlates with clinical prognosis. We also sought to identify 
purine-motif triplex DNA-binding proteins in human cells. 

Numerous genetic, cytogenetic, and epigenetic aberra- 
tions act at specific stages in colorectal cancer initiation and 
progression and influence response to therapy, such as 
inactivation of tumor suppressor APC as an initiating event 
and KRAS or BRAF mutations as markers of non-response 
to EGFR-targeted therapy [23]. High-throughput studies 
have suggested the existence of additional undiscovered 
cancer genes that may promote colorectal cancer develop- 
ment [24-26]. Colorectal cancer is also one of the more 
genetically unstable cancers, with about 65% of sporadic 
adenomas and cancers being characterized by chromosomal 
instability (CIN), 10-15% characterized by microsatellite in- 
stability (MSI), and approximately 20% having a CIMP 
phenotype, with some overlap among these characteristics. 

We have found higher triplex DNA-binding activity 
in vitro in colorectal tumor extracts than in corresponding 
normal tissue extracts using EMSA, and that this increased 
binding activity correlated significantly with the spread of 
cancer to the lymph nodes, metastasis, and reduced overall 
survival. We also found that expression of the triplex/ 
G-quadruplex-unwinding helicase WRN correlated signifi- 
cantly with total triplex DNA-binding activity in EMSAs in 
both normal and tumor tissue extracts. Biotin purine-motif 
triplex DNA affinity identified three multifunctional spli- 
cing factors: U2AF65, PSF, and p54nrb, and an anti- 
U2AF65 antibody produced a super-shifted EMSA band. 
High U2AF65 expression was associated with advanced 
colon tumor stages and with p54nrb and PSF expression in 
tumors. U2AF65 expression also correlated significantly 
with both total and truncated beta-catenin, as well as 
NF-κB p65, PCNA, EGFR, mTOR, PTEN, and Stat5 in 
colorectal tumors. 

Materials and methods 

Preparation of cytoplasmic and nuclear extracts of tissue and cell lines 

Tissue samples of tumor and adjacent normal mucosa were 
collected at surgical resection after informed consent, 
verified by a pathologist, and snap-frozen in liquid nitrogen. 
The patients had not previously received any chemotherapy; 
the tissues are therefore chemotherapy-naive. Frozen tissue 
samples were prepared as described by Asangani et al. [27]. The 
samples were pulverized with a Sartorius Mikrodismembrator, 
then extracted for 30 min on ice with Schaffner 
lysis buffer A (10 mM HEPES-Na+ pH 7.9, 10 mM KCl, 
0.1 mM EDTA pH 8.0, 0.1 mM EGTA pH 8, 1 mM 
dithiothreitol, 0.5% Triton X-100, Sigma phosphatase 
inhibitor cocktail 2, and Roche Complete Mini protease 
inhibitor) and centrifuged at 13,000 rpm, 4°C in a 
microcentrifuge to produce cytoplasmic extracts. The 
nuclear pellet was extracted for 30 min on ice with 
Schaffner buffer C (20 mM HEPES-Na+ pH 7.9, 0.4 M 
NaCl, 0.1 mM EDTA pH 8.0, 0.1 mM EGTA pH 8.0, 
1 mM dithiothreitol, 20% glycerol, with phosphatase 
and protease inhibitors) and centrifuged at 13,000 rpm, 
4°C in a microcentrifuge to produce nuclear extracts 
[28]. Total protein concentrations were determined 
using the Pierce BCA Protein Assay kit. Colorectal can- 
cer cell lines and HeLa cytoplasmic or nuclear extracts 
were similarly prepared using Schaffner buffers A and 
C, respectively. 

Purine-motif triplex DNA formation and 33P-labeling 

Purine triplex DNA oligonucleotide sequences and probe 
formation were as previously described [16,17]. The parent 
duplex oligonucleotides are PuGA: 5' - AATTCCTAAGG 
GAGGGGAGGGGAGGGTAGCT - 3' and complementary 
strand PuCT: 5' - AGCTACCCTCCCCTCCCCTCCCT 
TAGG - 3'. The parent duplex DNA was made by annealing 
equimolar (0.1 mM) concentrations of the PuGA and PuCT 
oligonucleotides at room temperature after boiling 
for 2 min in 40 mM Tris-HCl pH 8.0, 10 mM MgCl2, 0.01% 
NP-40. The purine-motif triplex-forming oligonucleotide 
(TFO) contained a 4'-(hydroxymethyl)-4,5',8-trimethylpsoralen-hexyl 
(Ps~) moiety at the 5'-terminus (Eurogentec): 
5' - Ps ~ GGG TGG GGT GGG GTG GGT -3'. To form 
triplex DNA, the parent duplex DNA and a 10-fold molar 
excess of TFO were incubated for 4 h at 30°C in 40 mM 
Tris-HCl pH 8.0, 100 mM MgCl2, 0.01% NP-40. The psoralenated 
TFO was then cross-linked to the parent DNA duplex 
with a 366 nm UV transilluminator for 10 min on ice. 
Purine triplex DNA (1 x 10^-7 M) was 3' end-labeled with T4 
kinase (New England Biolabs) and γ-33P dATP for 1 h at 37°C. 
Unincorporated labeled dATP was removed from the 
reaction by centrifuging the reaction mixture with an equal 
volume of 10 mM Tris-HCl pH 8.0, 10 mM MgCl2, 
0.05% Triton X-100 through a G25 Microspin column 
(GE Healthcare). 

Electrophoretic mobility shift assay (EMSA) 
and super-shift EMSA 

Gel shifts were also performed as previously described 
[16,17]. In this study, 5 µg total protein from tissue 
extracts or 1.5 µg HeLa or colorectal cancer cell line 
cytoplasmic or nuclear extracts were mixed with 1 nM 
33P-labeled purine triplex DNA and 2 µg poly(dI-dC) 
carrier DNA in binding buffer (25 mM HEPES-Na+ pH 
7.9, 50 mM KCl, 10% glycerol, 0.5 mM dithiothreitol, 
2 mM MgCl2) for 30 min at room temperature. 
Protein-triplex DNA probe complexes were resolved by 
nondenaturing PAGE at 7 V/cm for 90 min through a 5% 
acrylamide/0.25% bisacrylamide gel containing 22 mM 
Tris borate, 0.5 mM EDTA, and 5% glycerol. 
Protein-probe complexes were visualized using autoradiography 
and quantitated with a Storm 840 PhosphorImager 
(Molecular Dynamics). Major EMSA H3 bands from 
each tissue sample were normalized by dividing by the 
H3 band value of HeLa nuclear extract present in each 
gel. For super-shift EMSA, protein extracts were incu- 
bated in the same binding buffer with purine triplex 
DNA probe for 30 min at room temperature, then 
400 ng of anti-U2AF65 MC3 antibody or mouse IgG 
antibody as a negative control (Santa Cruz) were added 
to the reaction and incubated for 1 h at room 
temperature. PAGE gels were run as for regular EMSA 
with the addition of a circulating cooling water bath to 
the gel apparatus. 
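
The per-gel normalization described above (each sample's H3 band divided by the HeLa nuclear H3 band run on the same gel) can be sketched as follows. This is a minimal illustration only: the function names and the band intensities are hypothetical and are not taken from the study data.

```python
def normalize_h3(band_intensities, hela_nuclear_h3):
    """Divide each sample's H3 band intensity by the HeLa nuclear H3 band
    measured on the same gel (hypothetical helper, not the authors' code)."""
    if hela_nuclear_h3 <= 0:
        raise ValueError("HeLa nuclear H3 reference must be positive")
    return {lane: value / hela_nuclear_h3
            for lane, value in band_intensities.items()}


def tumor_normal_ratio(normalized, tumor_lane, normal_lane):
    """Tumor/normal (T/N) ratio of two normalized H3 values."""
    return normalized[tumor_lane] / normalized[normal_lane]


# Illustrative phosphorimager counts for one patient on one gel (made up).
gel = {"N cy": 2200.0, "N nu": 1300.0, "T cy": 8600.0, "T nu": 6900.0}
normalized = normalize_h3(gel, hela_nuclear_h3=10000.0)
nuclear_ratio = tumor_normal_ratio(normalized, "T nu", "N nu")
```

Because the reference band comes from the same gel as the samples, exposure and labeling differences between gels cancel out of the normalized value.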

Statistical correlations 

The Wilcoxon signed-rank test was used to compare the 
level of the major EMSA H3 complex and WRN 
expression in total, cytoplasmic, and nuclear extracts of 
colorectal tumors and corresponding normal tissues. 
The Mann-Whitney U test was used with SPSS version 
13.0 to compare quantitative variables in two independent 
groups. Spearman correlations among continuous 
variables were computed. Chi-square tests (Bonferroni-corrected) 
were used for grouped/dichotomized variables. Survival 
was estimated using Kaplan-Meier analysis, and 
differences were calculated using Mantel-Cox log-rank 
statistics; primary endpoints were tumor-related death 
(disease-specific survival), death (overall survival), and tumor 
recurrence (recurrence-free survival, R0 patients only). The 
following variables were dichotomized: the tumor/normal 
ratios of protein levels in nuclear and total (cytoplasm and 
nucleus) extracts at the median value, as high levels in tumor 
(values above the median) vs. low levels in tumor (values 
below the median) as compared with normal tissue; involved 
lymph nodes as pN0 vs. pN1-3; distant metastasis as M0 vs. 
M1; and surgical curability as curative vs. non-curative 
resection (R0 vs. R1/2). 
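
The median dichotomization and the Kaplan-Meier estimate described above can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the data are made up, values equal to the median are assigned to the low group as an assumption, and the Mantel-Cox log-rank comparison is not implemented.

```python
from statistics import median


def dichotomize_at_median(values):
    """Label each value 'high' (above the median) or 'low' (otherwise).
    Placing ties with the median in 'low' is an assumption of this sketch."""
    cut = median(values)
    return cut, ["high" if v > cut else "low" for v in values]


def kaplan_meier(times, events):
    """Product-limit survival estimate.
    times: follow-up in months; events: 1 = death, 0 = censored.
    Returns [(time, survival)] at each time with at least one death."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for tt, e in zip(times, events) if tt == t and e)
        leaving = sum(1 for tt in times if tt == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored cases both leave the risk set
    return curve


# Illustrative T/N ratios, follow-up times, and death indicators (made up).
ratios = [0.8, 1.2, 1.4, 1.6, 2.1, 3.0]
months = [30, 24, 36, 10, 18, 28]
died = [0, 0, 0, 1, 1, 0]

cutoff, groups = dichotomize_at_median(ratios)
high_curve = kaplan_meier(
    [m for m, g in zip(months, groups) if g == "high"],
    [d for d, g in zip(died, groups) if g == "high"],
)
low_curve = kaplan_meier(
    [m for m, g in zip(months, groups) if g == "low"],
    [d for d, g in zip(died, groups) if g == "low"],
)
```

In this toy data set the low group has no deaths, so its curve is empty and survival stays at 1.0 throughout follow-up.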

Purification of triplex DNA-binding proteins using biotin/ 
streptavidin affinity 

Biotinylated purine triplex DNA was formed using a 3' bio- 
tinylated PuCT oligonucleotide (Eurogentec): 5' - AGC 
TACCCTCCCCTCCCCTCCCTTAGGAATTTT-biotin- 
3' annealed to the PuGA complementary strand, then 
annealed and crosslinked with the Ps ~ TFO as described 
above. Purification of DNA-binding proteins using bio- 
tin/streptavidin affinity systems, as described in Current 
Protocols in Molecular Biology [29], was performed in 
separate 2 ml reactions containing either 800 µg RKO 
colorectal cancer cell nuclear extract or 1085 µg RKO 
cytoplasmic extract, EMSA binding buffer (25 mM 
HEPES-Na+ pH 7.9, 50 mM KCl, 10% glycerol, 0.5 mM 
dithiothreitol, 2 mM MgCl2), 600 µg poly(dI-dC), 1 nM 
biotinylated purine triplex DNA, and 150 µl pretreated 
streptavidin agarose (Fluka) while rotating for 2 hr at 
room temperature. Streptavidin agarose was gently pel- 
leted and washed three times with binding buffer. 
Laemmli buffer was added directly to the agarose pellet 
and boiled for 5 min to elute bound protein(s). Proteins 
were separated using 10% SDS-PAGE and stained with 
Coomassie blue. Two bands (100 and 60 kDa) from the 
nuclear extract reaction and one band (65 kDa) from the 
cytoplasmic extract reaction were excised from the gel 
and submitted to the German Cancer Research Center 
(DKFZ) Functional Proteome Analysis laboratory for 
sequencing and analysis using nano-HPLC ESI-MS-MS 
and identified using MASCOT database searches. 

Western blotting 

Western blot analysis was performed using standard 
procedures as described in Current Protocols in Molecular 
Biology [27]. 25 µg total protein from tissue or cell 
line cytoplasmic or nuclear extract was separated by 10% 
SDS-PAGE, then electro-transferred to nitrocellulose 
membranes in 25 mM Tris, 190 mM glycine with 20% 
methanol. After blocking in 5% milk in Tris-buffered sa- 
line with 0.2% Tween-20 (TBST) for 1 hr at room 
temperature, membranes were incubated with antibodies 
against WRN (H-300 Santa Cruz sc-5629, 1:500), 
U2AF65 (MC3 Santa Cruz sc-53942, 1:2000), PSF (39-1 
Santa Cruz sc-101137, 1:2000), p54nrb (H-85 Santa 
Cruz sc-67016, 1:2000) in 5% milk-TBST for 1 hr at 
room temperature, or beta-catenin (L87A12 Cell 
Signaling CS-2698, 1:1000) or actin (Sigma A2066, 
1:1000) in 5% milk in TBST overnight at 4°C. Blots were 
washed with TBST, incubated with the appropriate HRP- 
conjugated secondary antibody at 1:4500, and detected by 
enhanced chemiluminescence (Pierce, Thermo Scientific) 
and autoradiography. Protein bands were quantitated by 
densitometry using NIH ImageJ software and normalized 
to actin. 

Reverse phase protein array (RPPA) 

RPPA was performed as described by Mannsperger et al. 
[30]. 2.7 ng cytoplasm or 2.8 ng nuclear protein extract 
per spot was printed with a non-contact spotter onto 
nitrocellulose slides (Oncyte Avid, Grace Bio-labs, Bend 
OR) using an Aushon 2470 Microarrayer (Billerica, 
MA). Slides were mounted in a customized incubation 
chamber (Metecon, Mannheim Germany), blocked for 
1 hr at room temperature with 50% (v/v) Odyssey block- 
ing buffer in PBS and individually stained with 37 vali- 
dated primary antibodies at 1:300 in blocking buffer at 
4°C overnight and Alexa 680-labeled secondary anti- 
bodies (Invitrogen) at 1:8000 in PBS with 0.05% Tween 
for 1 hr at room temperature. Slides were scanned with 
the Licor Odyssey system and spot intensities were cal- 
culated with GenePix Pro 5.0 microarray analysis soft- 
ware (Molecular Devices). To estimate the total protein 
concentration per spot, a slide from each run was 
stained with Fast Green FCF (Sigma-Aldrich) as 
described by Loebke et al. [31]. Data analysis was done 
using R with the RPPanalyzer package from CRAN 
(http://cran.r-project.org, [32]). For each antibody, the 
corresponding logged Fast Green FCF signal was subtracted 
from the logged mean of the raw foreground pixel intensities 
of each spot to normalize for the total protein per spot. 
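
The spot normalization described above amounts to a difference of logarithms. The sketch below is an illustration only and is not the RPPanalyzer implementation: it assumes base-2 logarithms and takes the log of the mean foreground intensity (rather than the mean of the logs), neither of which is specified in the text, and the pixel values are made up.

```python
import math


def normalize_spot(foreground_pixels, fast_green_signal):
    """Return log2(mean foreground) - log2(Fast Green FCF signal).
    Base 2 and log-of-mean are assumptions of this sketch."""
    mean_foreground = sum(foreground_pixels) / len(foreground_pixels)
    return math.log2(mean_foreground) - math.log2(fast_green_signal)


# Illustrative raw pixel intensities and total-protein stain signal (made up).
antibody_pixels = [1000.0, 1024.0, 1048.0]
normalized_signal = normalize_spot(antibody_pixels, fast_green_signal=256.0)
```

Subtracting on the log scale is equivalent to dividing the antibody signal by the total-protein signal, so spots with more printed protein are not scored as higher expression.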

Results 

Colorectal tumors have higher triplex DNA-binding 
activity than corresponding normal tissue 

A summary of the clinical characteristics of the 63 study 
patients is shown in Table 1. To examine purine-motif 
triplex DNA-binding proteins, cytoplasmic and nuclear 
extracts from tumor and corresponding normal tissues of 
63 colorectal cancer patients were isolated and examined by 
gel shifts (EMSA). Figure 1 presents examples of EMSAs 

Table 1 Patient clinical characteristics 

Tumor characteristics             Absolute (n = 63)   Relative (%) 
Sex                  Male         46                  73% 
                     Female       17                  27% 
Localization         Colon        36                  57.1% 
                     Rectum       27                  42.9% 
TNM Staging          pT1          10                  15.9% 
                     pT2          5                   7.9% 
                     pT3          34                  54% 
                     pT4          16                  25.4% 
Lymph Node Status    pN0          37                  58.7% 
                     pN1          16                  25.4% 
                     pN2          10                  15.9% 
Metastasis Staging   M0           42                  66.7% 
                     M1           21                  33.3% 
Status               alive        43                  68.1% 
                     dead         20                  31.7% 

from eight patients representing all four tumor stages, 
where in most samples one major band (H3) is present in 
varying amounts. In some patients, tumor cytoplasmic 
extracts contained a higher amount of the major H3 com- 
plex than normal or tumor nuclear extracts (patients 1 and 
5), while in other patients, tumor nuclear extracts con- 
tained a higher amount of the major H3 complex (patients 
6 and 8). Cytoplasmic and nuclear extracts from HeLa cells 
were included as positive controls. Normalized EMSA H3 
values are listed below each sample. To verify that the 
major EMSA H3 band is specific for the triplex DNA 
probe, we tested the 33P-labeled parent duplex DNA probe 
lacking G*G base pairs, which did not produce the major H3 
complex in patient tissue or HeLa nuclear extracts (Additional 
file 1: Figure S1). EMSA H3 binding values were generally higher 
in tumor than normal tissue, whether evaluating cytoplasmic 
extracts (mean = 0.512, median = 0.509 for tumor tissue; 
mean = 0.386, median = 0.384 for normal tissue) or 
nuclear extracts (mean = 0.361, median = 0.368 for tumor 
tissue; mean = 0.264, median = 0.228 for normal tissue) as 
shown in Figure 2. Wilcoxon signed-rank test results showed 
significantly higher triplex DNA EMSA binding activity in 
tumor than normal extracts when examining total measures 
(p = 0.001), cytoplasmic extracts only (p = 0.001) and 
nuclear extracts only (p = 0.012) (Additional file 2). We also 
performed EMSA analysis of cytoplasmic and nuclear 
extracts of eight colorectal cancer cell lines (GEO, SW480, 
HT29, HCT116, Colo206F, WiDr, Colo320, and RKO) and 
found that all eight cell lines had a triplex DNA-binding 
protein pattern that was very similar to HeLa extracts, with 
a moderate amount of the major H3 band produced by 
cytoplasmic extracts and an abundant amount of the H3 
band produced by nuclear extracts (Additional file 1: Figure 
S2a). 



Increased triplex DNA-binding activity in colorectal 
tumors correlates with lymph node disease, metastasis, 
and overall survival 

We wanted to investigate whether the amount of the 
EMSA H3 complex correlated with patient clinicopatholo- 
gical data and overall survival. Median follow-up time for 
patient clinical data was 28.9 months. Normalized EMSA 
data of patient samples were correlated with clinical risk 
factors and computed for univariate prognostic impact. We 
observed that lymph node disease (N-Stage) was signifi- 
cantly associated with the ratio of tumor/normal (T/N) 
triplex-binding activity for cytoplasmic and nuclear extracts 
and total values (p = 0.026; 0.019; 0.017, respectively, 
Table 2a). Patients without lymph node 
disease at diagnosis thus had significantly lower binding 
ratios (T/N) in both cytoplasmic and nuclear extracts. Also, 
the triplex DNA-binding activity in tumor nuclear extracts 
and total tumor extracts correlated significantly with me- 
tastasis (p = 0.031, p = 0.046, respectively, Table 2b). Kaplan- 
Meier survival analysis using a median cut-off of 1.5 
(rounded-up) for the nuclear binding activity ratio (T/N) 
showed significantly lower overall survival in patients 
whose T/N nuclear binding activity ratio was greater than 
1.5 (n = 30; p = 0.026) than in patients whose ratio was less 
than 1.5 (n = 33, Figure 3, Additional file 2). This suggested 
that although triplex DNA-binding protein(s) were present 
in normal colorectal tissue extracts, they were more abun- 
dant in tumor extracts. It also suggested that an abundance 
of the major triplex-binding EMSA complex (H3) in the 
nuclei of tumor cells was associated with metastasis and 
reduced overall survival (Additional file 3). 
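
As a worked check of how the Mann-Whitney statistics reported in Table 2a relate to each other, the sketch below recomputes Z, the two-tailed p value, and Wilcoxon W from the reported U values. It is an illustration under stated assumptions: the group sizes (37 pN0 and 26 pN1/pN2 patients) are taken from Table 1, the normal approximation is used without a tie correction, and the function names are hypothetical.

```python
import math


def mann_whitney_z(u, n1, n2):
    """Normal approximation for the Mann-Whitney U statistic (no tie correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


def two_tailed_p(z):
    """Two-tailed p value of a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2))


def wilcoxon_w(u, n):
    """Rank sum of the group of size n whose U statistic is u."""
    return u + n * (n + 1) / 2


n_pn0, n_pn_pos = 37, 26  # group sizes assumed from Table 1
reported_u = {"cytoplasm": 322.0, "nucleus": 313.0, "total": 310.5}

results = {
    name: (
        round(mann_whitney_z(u, n_pn0, n_pn_pos), 3),
        round(two_tailed_p(mann_whitney_z(u, n_pn0, n_pn_pos)), 3),
        wilcoxon_w(u, n_pn0),
    )
    for name, u in reported_u.items()
}
# results["cytoplasm"] -> (-2.22, 0.026, 1025.0)
```

Under these assumptions the recomputed values agree with the Z, significance, and Wilcoxon W entries listed for the cytoplasmic, nuclear, and total T/N ratios in Table 2a.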

Identification of U2AF65 as the protein present 
in the EMSA H3 complex 

We wished to identify the protein(s) responsible for binding 
the triplex DNA probe in the major EMSA H3 complex. 
We isolated biotinylated purine-motif triplex DNA-protein 
complexes from RKO cells with streptavidin-conjugated 
agarose, separated the complexes by SDS-PAGE, and 
stained with Coomassie Blue. Protein bands were ana- 
lyzed by nano-HPLC ESI-MS-MS and identified using 
MASCOT database searches. We identified (1) 100-kDa 
and (2) 60-kDa proteins from nuclear extracts and a (3) 
65-kDa protein from cytoplasmic extracts. These corre- 
sponded to the following proteins: 

(1) PSF (polypyrimidine tract binding-associated 
splicing factor, or SFPQ) [NCBI Protein AAH04534] 

(2) p54nrb (nuclear RNA-binding protein) or NonO 
[NCBI Protein NP_031389] 

(3) U2AF65 (U2 small nuclear RNA auxiliary factor 2 
isoform b) [NCBI Protein NP_001012496] 

[Figure 1 gel image. Each patient has four lanes (N cy, N nu, T cy, T nu); each gel also carries HeLa cy and nu control lanes. Normalized H3 values in lane order: Patient 1 (Stage 2): .22 .13 .86 .69; Patient 2 (Stage 3): .44 .18 .66 .67; Patient 3 (Stage 3): .24 .17 .54 .60; Patient 4 (Stage 1): .29 .09 .43 .38; Patient 5 (Stage 3): .46 .23 1.0 .38; Patient 6 (Stage 4): .28 .31 .89 .92; Patient 7 (Stage 4): .25 .21 .80 .39; Patient 8 (Stage 4): .43 .07 .55 .66.] 

Figure 1 Electrophoretic Mobility Shift Assay (EMSA) of Resected Tissue Extracts with Purine Triplex DNA. 33P-labeled purine-motif triplex 
DNA (1 nM) was complexed with 5 µg total protein from normal cytoplasmic (N cy), normal nuclear (N nu), tumor cytoplasmic (T cy) or tumor 
nuclear (T nu) extracts of tissues obtained from eight selected colorectal cancer patients. 1.25 µg HeLa cytoplasmic and nuclear extracts were 
used as positive (+) controls. The major EMSA band was H3, indicated with an arrow. Normalized EMSA H3 values are listed below the 
corresponding samples. The purine-motif triplex probe alone is shown in lane 1. 



PSF and p54nrb are known to function as RNA polymer- 
ase II-associated splicing factors, bind as heterodimers, and 
are implicated in the regulation of expression of the Myc 
family of oncoproteins, COX2, etc. They also bind to and 
stimulate topoisomerase I and promote homologous DNA 
pairing and the incorporation of a single-stranded 
oligonucleotide into homologous superhelical double-stranded 
DNA (D-loop formation) [33,34]. U2AF65, identified from 
cytoplasmic extracts, is also an RNA polymerase II- 
associated splicing factor that can associate with mRNAs 

Figure 2 Normalized EMSA H3 values for all 63 colorectal cancer patient extracts. Box plots indicating the median normalized EMSA H3 
values (white lines in the boxes), upper and lower quartiles (25th through 75th percentiles defined by the shaded box), and ranges of data values 
for each extract type. N cyto, cytoplasmic normal tissue extracts; N nuc, nuclear normal tissue extracts; T cyto, cytoplasmic tumor tissue extracts; 
T nuc, nuclear tumor tissue extracts. 



that include a predominance of transcription factors and 
cell cycle regulators, and shuttle continuously between the 
nucleus and cytoplasm [35,36]. 

Super-shift EMSA with a well-characterized monoclonal 
antibody against U2AF65 [37] consistently produced a 
super-shifted H3 band in all human extracts tested that 
were known to express U2AF65 by Western blot analysis 
(RKO and tumor tissue cytoplasmic and nuclear extracts 
are shown in Figure 4). This confirmed that U2AF65 is 
present in the H3 triplex DNA-protein complex observed 



Table 2 Correlation of the ratio of tumor (T) to normal (N) (T/N) EMSA H3 values for each patient with clinical features: 
test statistics (a) by presence of disease in lymph nodes (N-Stage) and (b) by presence of metastasis in distant organs 
(distant metastasis) 

(a) Grouping variable: presence of disease in lymph nodes (N-Stage), dichotomized 

                          Ratio T/N    Ratio T/N    Ratio T/N 
                          cytoplasm    nucleus      total 
Mann-Whitney U            322.000      313.000      310.500 
Wilcoxon W                1025.000     1016.000     1013.500 
Z                         -2.220       -2.345       -2.380 
Asymp. Sig. (2-tailed)    0.026        0.019        0.017 

(b) Grouping variable: presence of metastasis in distant organs (distant metastasis) 

                          Cytoplasm    Nucleus      Total 
                          tumor        tumor        tumor 
Mann-Whitney U            342.500      309.500      321.000 
Wilcoxon W                1245.500     1212.500     1224.000 
Z                         -1.689       -2.156       -1.993 
Asymp. Sig. (2-tailed)    0.091        0.031        0.046 

Figure 3 Overall survival according to the tumor:normal (T/N) colorectal tissue nuclear triplex DNA-binding activity ratio. Cut-off = 1.5 
(rounded-up median). 



by EMSA (Figure 4). Available antibodies against PSF or 
p54nrb did not produce any super-shifted bands in our 
EMSA analysis (Additional file 1: Figure S3). 

U2AF65 expression correlates with EMSA H3 values 
and p54nrb and PSF expression in tumor tissues 
and with a higher tumor stage 

We measured expression of the three splicing factors in 
normal and tumor colorectal tissue extracts obtained 
from 51 of the 63 patients using Western blotting to 
determine if triplex DNA-binding activity in EMSA 
correlates directly with U2AF65, PSF, and/or p54nrb 
total protein expression. Spearman correlations indi- 
cated that U2AF65 expression correlated significantly 
with EMSA H3 values, and that the correlation was 
highly significant in tumor extracts (cytoplasmic 
p = 1.8e-8; nuclear p = 5.9e-5; total p = 1.8e-8; Table 3a, 
Additional file 4). In comparison, PSF and p54nrb were 



[Figure 4 gel image. Lane groups: RKO cyto, RKO nuc, Patient T cyto, Patient T nuc; bands labeled H3 supershift and H3.] 

Figure 4 Production of a super-shifted H3 band in RKO and patient tissue extracts by super-shift EMSA with a monoclonal antibody 
against U2AF65. 33P-labeled triplex DNA (1 nM) was complexed with 1.5 µg total protein from RKO cytoplasmic (lanes 2-4), RKO nuclear (lanes 
5-7), 5 µg tumor cytoplasm (T cyto lanes 8-10) or tumor nuclear (T nuc lanes 11-12) extracts. Lanes 2, 5, 8, and 11, no antibody; lanes 3, 6, 9, and 
12, 400 ng anti-U2AF65 antibody MC3; lanes 4, 7, and 10, mouse IgG antibody (negative control). Each reaction also contained 2 µg poly(dI-dC) 
carrier DNA. Lane 1, triplex DNA probe alone. 

Table 3 (a) Spearman correlation p values of EMSA H3 values with expression of triplex DNA-binding proteins (3BP) 
and (b) correlations of U2AF65 expression to PSF and p54nrb expression in normal and tumor tissue extracts 

(a) 3BP expression correlated with EMSA H3 

                     U2AF65         p54nrb         PSF 
Normal cytoplasm     [illegible]    [illegible]    [illegible] 
Normal nucleus       [illegible]    0.11           [illegible] 
Normal total         0.016          0.26           0.095 
Tumor cytoplasm      1.8e-08        0.53           0.019 
Tumor nucleus        5.9e-05        0.0071         0.036 
Tumor total          1.8e-08        0.0048         7.5e-05 

(b) Correlation to U2AF65 

                     p54nrb         PSF 
Normal cytoplasm     0.00066        0.085 
Normal nucleus       0.00037        0.073 
Normal total         0.00040        0.094 
Tumor cytoplasm      0.0041         0.0002 
Tumor nucleus        1e-06          0.0005 
Tumor total          1e-06          1e-06 


highly expressed in nuclear extracts but seldom detected 
in cytoplasmic extracts, and their expression correlated 
with EMSA H3 values only in tumor nuclear extracts 
(p = 0.036 and 0.0071, respectively) (Table 3a). When we 
correlated the expression of the three splicing factors with 
each other, PSF and p54nrb were highly significantly 
associated in nuclear extracts of both normal and tumor tissue 
(p = 1e-6 in both), as expected, because they are known to bind 
and function as heterodimers. Also, U2AF65 expression 
was highly significantly correlated with p54nrb expression 
in both normal and tumor nuclear extracts (p = 0.00037 
and 1e-6, respectively) (Table 3b), but with PSF expression 
only in tumor nuclear extracts (p = 0.0005), suggesting a 
unique functional aspect of U2AF65 and PSF in tumor cell 
nuclei. We also examined expression of the three splicing 
factors identified by biotin triplex DNA affinity in the 
eight colorectal cancer cell lines using Western blotting. 
Consistent with patient tissue data, U2AF65 expression 
from all cell line extracts most closely matched the 
abundance of the EMSA H3 band, with moderate 
expression in all cytoplasmic extracts and abundant ex- 
pression in all nuclear extracts (Additional file 1: 
Figure S2b). 

Having shown that the EMSA H3 complex was 
increased in tumor compared to adjacent normal tissue, 
we wished to determine if U2AF65, p54nrb and PSF ex- 
pression was associated with tumor stage. U2AF65 pro- 
tein expression according to extract type and tumor 
stage in all colon tumors is shown in Figure 5. Colon 
tumors at advanced clinical stages, UICC 
Stage III and IV (Dukes C and D), expressed significantly 
higher U2AF65 in the cytoplasm and overall than did 
tumors at early stages (mean U2AF65 expression in tumor 
cytoplasm: UICC Stage I and II = 0.349 vs. 
UICC Stage III and IV = 0.491; p = 0.024 [Mann-Whitney 
U test, Additional file 5]). PSF and p54nrb expression 
was not significantly correlated with tumor stage. While 
both p54nrb and PSF expression were significantly cor- 
related with EMSA H3 values in tumor but not normal 
tissue extracts, the antibodies against these proteins that 
we tested were unable to produce a super-shifted EMSA 
band. Thus the relevance of p54nrb and PSF as triplex 
DNA-binding proteins remains to be determined. 
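
The stage comparison above rests on two steps: dividing each band intensity by the actin intensity of the same extract, and comparing the two stage groups with a Mann-Whitney U test. The following is a minimal Python sketch of those steps, not the analysis code used in the study. The function names and all input values are hypothetical, and the p value uses a normal approximation without tie or continuity correction, so it will differ somewhat from the exact values a statistics package reports.

```python
import math


def normalize_to_actin(target_band, actin_band):
    """Divide a target band intensity by the actin intensity of the same extract."""
    if actin_band <= 0:
        raise ValueError("actin intensity must be positive")
    return target_band / actin_band


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, approximate two-sided p) for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    u = min(u_a, n_a * n_b - u_a)
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    # Two-sided tail probability of the standard normal distribution.
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical (target, actin) densitometry readings, not data from the study.
early_stage = [normalize_to_actin(t, a)
               for t, a in [(21, 100), (30, 100), (35, 100), (40, 100)]]
advanced_stage = [normalize_to_actin(t, a)
                  for t, a in [(45, 100), (50, 100), (55, 100), (62, 100)]]

u_stat, p_value = mann_whitney_u(early_stage, advanced_stage)
print(f"U = {u_stat:.1f}, approximate two-sided p = {p_value:.3f}")
```

With groups as small as in this illustration, an exact test would be preferable to the normal approximation.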

Expression of the WRN helicase correlates with EMSA H3 
binding activity 

We wanted to test the hypothesis that proteins that bind to 
or stabilize triplexes and G-quadruplexes can act in a yin-yang 
fashion (in complementary opposition) with proteins 
such as helicases that unwind or destabilize these structures, 
and that the expression and/or function of these binding 
and unwinding proteins may be imbalanced in tumors, which 
could contribute to genomic instability. We tested 51 patient 
colorectal tumor and normal tissue extracts for expression 
of the RecQ-family helicase WRN because it is 
known to act preferentially on aberrant structures such as 
triplexes and G-quadruplexes and to promote genomic integrity 
[19]. We used the Wilcoxon signed-rank test to determine 
if WRN is differentially expressed in normal and 
tumor tissue extracts and Spearman's rho to correlate 
WRN helicase expression in normal and tumor tissue 
extracts with EMSA H3 data. We detected no significant 
differences in normalized WRN expression between normal 
and tumor extracts or according to tumor stage (mean 
cytoplasmic expression in tumor tissue = 0.424, in normal 
tissue = 0.283; mean nuclear expression in tumor tissue 
= 0.275, in normal tissue = 0.196; mean total expression in 
tumor tissue = 0.679, in normal tissue = 0.465). However, we 
Figure 5 U2AF65 protein expression by colon tumor stage. Total protein (25 µg) from cytoplasmic (cyto) and nuclear (nuc) colon tumor and 
normal tissue extracts was separated using 10% SDS-PAGE and electro-transferred to nitrocellulose membranes. Blots were incubated with 
anti-U2AF65 antibody MC3 and detected using chemiluminescence and autoradiography. Blots were reprobed with an anti-actin antibody, and 
densitometry was performed using NIH ImageJ software. U2AF65 expression values were normalized by dividing by the actin expression value in 
each extract, and plotted according to colon tumor stage using the R program (Additional file 6). 



did observe that total WRN expression correlated significantly 
with total EMSA H3 binding values in both normal 
tissue (rho = 0.296, p = 0.03) and tumor extracts (rho = 0.460, 
p < 0.001). 
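
The paired normal-versus-tumor comparison described above can be illustrated with the rank sums at the core of the Wilcoxon signed-rank test. This is a minimal Python sketch under stated assumptions: the function names and expression values are hypothetical, pairs with a zero difference are dropped (one common convention), and only the rank sums are computed. Converting them to a p value requires an exact table or a normal approximation, which a statistics package would normally supply.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_rank_sums(normal, tumor):
    """Return (W_plus, W_minus, n) for paired samples, dropping zero differences."""
    diffs = [t - n for n, t in zip(normal, tumor) if t != n]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


# Hypothetical actin-normalized WRN values for five patients (not study data).
normal_wrn = [0.20, 0.30, 0.25, 0.40, 0.10]
tumor_wrn = [0.35, 0.28, 0.45, 0.70, 0.15]

w_plus, w_minus, n_pairs = wilcoxon_rank_sums(normal_wrn, tumor_wrn)
print(f"W+ = {w_plus}, W- = {w_minus}, informative pairs = {n_pairs}")
```

The two rank sums always add up to n(n + 1)/2 for n informative pairs, which makes a convenient sanity check.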

Reverse-phase protein array and Western blot analysis 
of tissue extracts show a correlation of U2AF65 expression 
with total and truncated beta-catenin expression 

Another goal of our study was to measure the expression 
of numerous cancer-relevant proteins in patient tissue 
extracts and to correlate it with EMSA H3 values and with 
expression of the three splicing factors identified using biotin 
triplex DNA affinity. This served as a screen to identify potentially 
relevant functional relationships among these splicing factors 
and other well-characterized proteins. Using reverse-phase 
protein array (RPPA) analysis, we examined extracts from 
51 patients (not all extracts met the minimum 
concentration needed for accurate measurement) for expression 
of cancer-related proteins with 37 previously validated 
antibodies, and calculated Spearman correlations for the 
expression of multiple signaling proteins. Significant correlations 
after Bonferroni correction for multiple testing 
were found with both EMSA H3 values and U2AF65 
expression for proteins including NF-κB p65, GSK3 beta, beta-catenin, 
Src, and PI3K p110 alpha (Table 4; exact p values are 
shown in Additional file 7: Table S1). The expression levels 
of a distinct set of proteins, including cyclin 
D1, c-Myc, JNK1, CDK4, Akt1, and Stat3, correlated significantly 
with both p54nrb and PSF expression. Expression of all 
three splicing factors and EMSA H3 values also correlated 
significantly with another set of proteins including p38 
alpha, ErbB1 (EGFR), mTOR, PTEN, and Stat5. 
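
The screening approach described above, rank correlation followed by a Bonferroni correction, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, expression values and p values are hypothetical, and using 37 as the number of tests is an assumption based on the number of antibodies, since the exact number of comparisons corrected for is not stated here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


def bonferroni_significant(p_values, n_tests, alpha=0.05):
    """Flag p values that stay below alpha after dividing alpha by n_tests."""
    threshold = alpha / n_tests
    return {name: p <= threshold for name, p in p_values.items()}


# Hypothetical expression values for five extracts (not study data).
splicing_factor = [0.1, 0.4, 0.2, 0.9, 0.5]
array_protein = [1.0, 3.0, 2.5, 8.0, 2.0]
rho = spearman_rho(splicing_factor, array_protein)

# Hypothetical p values; 37 tests is an assumption (one per antibody).
flags = bonferroni_significant(
    {"protein_a": 1e-6, "protein_b": 0.004, "protein_c": 0.2}, n_tests=37)
print(f"rho = {rho:.2f}", flags)
```

Bonferroni correction is conservative: with 37 tests, only p values below roughly 0.00135 remain significant at an overall alpha of 0.05.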

The most highly significant correlation in our RPPA 
analysis was that between U2AF65 expression and beta- 
catenin (p = 9e-10), known to be deregulated and a 
major player in the etiology of colorectal cancer. To con- 
firm our RPPA results, we compared Western blots of 
beta-catenin and U2AF65 expression in tissue extracts 
from 50 patients. Representative Western blots for six 
patients are shown in Figure 6, which includes some pa- 
tient samples also shown in Figure 1 EMSAs. These data 
were quantitated by densitometry and graphed in 
Additional file 1: Figure S4. According to Spearman's 
rho, we observed that total beta-catenin and U2AF65 
expression are highly significantly correlated in cytoplasmic 
and nuclear tumor extracts (p = 5.7e-6 and p = 3.1e-6, 
respectively), while their expression correlated significantly 
in normal nuclear extracts (p = 0.0018), and 
Table 4 Spearman correlations of EMSA H3 values and 
triplex DNA-binding protein expression to other proteins 
by reverse phase protein array (RPPA) 

EMSA H3 U2AF65 p54nrb PSF 

Ph-Erk ** 
NF-κB p65 *** *** 
Cyclin D1 *** *** 
GSK3β *** * 
PCNA *** 
β-catenin *** *** 
Ph-Raf ** 
Src 
p38α ** ** *** *** 
Cdk4 ** *** 
Akt1 ** *** 
ErbB1 *** *** 
Bcl-2 
mTOR [illegible] 
PI3K p110α ** *** 
PLCγ *** 
PTEN [illegible] 
Stat3 * ** *** 
Stat5 *** *** *** *** 

*p < 0.05; **p < 0.01; ***p < 0.001. 

showed no significant correlation in normal cytoplasmic 
extracts (p = 0.15). In addition, beta-catenin expression 
was higher in cytoplasmic and nuclear extracts of stage 
III and IV colon tumors than in those of stage I and II 
colon tumors (Additional file 1: Figure S5). Western 
blots of beta-catenin expression showed truncated bands 
(65-80 kDa) for some extracts but not for others, which 
was consistent with previous reports of truncated or 
novel splice forms of beta-catenin mRNA [38,39] and an 
80-kDa truncated beta-catenin protein [40] in colorectal 
cancer. In addition to a significant correlation between 
full-length beta-catenin (92 kDa) expression and 
U2AF65 expression, we found a significant correlation 
between truncated beta-catenin and U2AF65 expression, 
particularly in the cytoplasm (p = 0.0047) and nuclei 
(p = 0.022) of tumor cells. 

Discussion 

Our data support the hypothesis that the 
major triplex DNA-binding protein in human cells is 
more abundant and has higher binding activity in vitro 
in extracts from colorectal cancer tissues than in extracts from 
adjacent normal tissues. This increased binding activity 
correlated significantly with the expression of the 
triplex/G-quadruplex DNA-unwinding helicase WRN, and with 
the spread of cancer to the lymph nodes, metastasis, and 
reduced overall survival. The major triplex DNA-binding 
protein in gel shifts was identified as the U2AF65 splicing 
factor. U2AF65 expression was higher in more 
advanced colon tumor stages and correlated significantly 
with total and truncated beta-catenin expression. 

U2AF is a non-small nuclear ribonucleoprotein (snRNP) 
splicing factor required for the binding of U2 snRNP to 
the pre-mRNA branch site [41,42]. Purified U2AF is composed 
of two polypeptides of 65 kDa (U2AF65) and 35 kDa 
(U2AF35). U2AF65 binds to the polypyrimidine 
(Py) tract adjacent to the 3' splice site using RNA-recognition 
motifs and cross-links to the branch point in 
an ATP-independent manner at the earliest stage of spliceosome 
formation [43]. Both subunits of U2AF are essential 
for the viability of many model organisms, such as 
zebrafish, Drosophila, C. elegans, and S. pombe [44]. Both 
U2AF65 and U2AF35 shuttle continuously between the 
nucleus and cytoplasm by a mechanism that involves carrier 
receptors and is independent of binding to mRNA. 
It has also been suggested that U2AF participates in the 
nuclear export of mRNA [45]. 

U2AF65 binds to single-stranded RNA and recognizes 
a wide variety of pyrimidine (Py) -tracts. The Py-tracts of 
higher eukaryotic pre-mRNAs are often interrupted with 
purines, yet U2AF65 must identify these degenerate Py- 
tracts for accurate pre-mRNA splicing. Based on in vitro 
studies, investigators have proposed that U2AF35 assists 
U2AF65 recruitment to nonconsensus polypyrimidine 
tracts. Pacheco et al. analyzed the roles of the two U2AF 
subunits in vivo in the selection of alternative 3' splice 
sites associated with polypyrimidine tracts of different 
strengths. Their results revealed a feedback mechanism 
by which RNA interference-mediated depletion of 
U2AF65 triggers downregulation of U2AF35 expression. 
They also showed that knockdown of each U2AF subunit 
inhibits weak 3' splice site recognition, while 
over-expression of U2AF65 alone is sufficient to activate 
selection of this splice site [46,47]. It would be interesting 
to examine whether over-expression of U2AF65 alone in the 
context of cancer activates splicing at weak or nonconsensus 
polypyrimidine tracts, which could tip the balance 
of splicing regulation in a subset of cellular transcripts 
and thereby promote tumorigenesis. 

The proteins we identified in RKO nuclear extracts 
using biotin triplex DNA affinity were PSF, a 100-kDa pro- 
tein that also binds to the polypyrimidine tract, and its 
heterodimeric binding partner p54nrb. We speculate that 
the 100- and 60-kDa proteins identified in previous stud- 
ies using Southwestern blotting with HeLa nuclear 
extracts [16] probed with the same purine triplex DNA 
probe used in this study are indeed PSF and p54nrb, but 

Figure 6 Western blots of U2AF65, PSF, p54nrb, and beta-catenin expression in normal and tumor colorectal tissue extracts. Total 
protein (25 µg) from cytoplasmic (cy) and nuclear (nu) tissue extracts obtained from six selected patients was separated using 10% SDS-PAGE 
and electro-transferred to nitrocellulose membranes. Blots were incubated with antibodies against U2AF65, PSF, p54nrb, beta-catenin, and 
actin, then with the appropriate secondary antibody, and detected using chemiluminescence and autoradiography. Each patient's tumor stage and 
number (as also used in Figure 1) and the corresponding EMSA H3 values are shown above the samples. 



this has yet to be tested. Both PSF and p54nrb bind to 
double-stranded (ds)DNA, single-stranded (ss)DNA, and 
RNA, and contain DNA- and RNA-binding domains. PSF 
participates in constitutive pre-mRNA splicing and is a 
component of later spliceosomal B and C complexes 
(when U2AF65 is no longer present). PSF and p54nrb also 
bind and function in nuclear retention of defective RNAs 
and are involved in transcriptional regulation and the 
DNA damage response [48-51]. Interestingly, PSF also 
functions in DNA annealing: its in vitro pairing activity requires 
ssDNA and dsDNA with sequence homology as well as 
divalent cations. PSF can promote 
the incorporation of ssDNA within the two separated 
strands of a homologous superhelical DNA duplex 
and produce a three-stranded D-loop structure, which is 
required for homologous recombination. Other splicing 
factors SF2/ASF and U2AF65 also caused DNA annealing 
but could not form D loops [52]. PSF and p54nrb, as well 
as GRSF-1, YB-1, and polypyrimidine tract-binding pro- 
tein (PTB) also bind to the MYC family of internal ribo- 
some entry sites (IRES) and positively regulate translation 
of the Myc family of oncoproteins in vitro and in vivo 
[53]. Protein array data in this study showed that expres- 
sion of both PSF and p54nrb in colorectal tissue extracts 
correlated significantly with c-Myc expression levels, 
which is consistent with a role for PSF and p54nrb in the 
regulation of c-Myc protein expression. 

Researchers identified both U2AF and PSF, as well as 
hnRNP C and PTB, as RNA-binding proteins that bind 
to two regions 3' of the (CUG)n repeat expansion in the 
3'-UTR of the DMPK gene, where expansion of this tri- 
nucleotide repeat causes the neuromuscular disorder 
myotonic dystrophy [54]. Their study explored RNA- 
binding proteins interacting with non-CUG regions or 
higher order structures in the DMPK 3'-UTR that may 
be involved in RNA-mediated pathogenesis. Their find- 
ing that both U2AF and PSF can bind near this triplet 
repeat sequence with the potential to form higher order 
structures such as triplexes is consistent with our data 
on biotin triplex DNA affinity identification of both 
U2AF65 and PSF. Another group identified an RNA/ 
protein complex in both Drosophila and 293 cells that 
consisted of expanded CAG RNA, U2AF65, and the 
NXF1 nuclear export receptor, providing further evi- 
dence that in other models, U2AF65 interacts with these 
triplet repeat sequences [55]. We believe that the purine 
triplex DNA EMSA probe can be a surrogate multiplex 



nucleic acid structure that acts as a "bait and hook" to 
capture proteins that may be binding D-loops, R-loops, 
triplexes, G-quadruplexes, or other multi-stranded struc- 
tures containing Hoogsteen or reverse Hoogsteen base 
pairs in vivo. 

PTB also binds to polypyrimidine tracts in pre-mRNAs, 
and numerous studies have shown that PTB competes 
with U2AF65 for binding to these sequences [56-61]. 
Since PSF is a PTB-associated protein, binding competition 
between PSF and U2AF65 may be possible as well, 
which may explain why we identified PSF with the 
biotinylated triplex DNA in RKO nuclear extracts but 
U2AF65 in RKO cytoplasmic extracts. Gama-Carvalho 
and colleagues performed immunoprecipitation of 
U2AF65- and PTB-associated RNAs from HeLa cells fol- 
lowed by microarray analysis to determine which mRNAs 
are associated with these two splicing factors that can 
compete for binding to polypyrimidine tracts [36]. Among 
U2AF65-associated mRNAs was a predominance of tran- 
scription factors and cell cycle regulators, whereas PTB- 
associated transcripts were enriched in mRNAs that en- 
code proteins implicated in intracellular transport, vesicle 
trafficking, and apoptosis. 

Related to cancer, researchers found that 2 of 14 patients 
with malignant mesothelioma, a pulmonary malignancy, 
had antibodies against U2AF65 using the SEREX tech- 
nique (serologic identification by recombinant expression 
cloning) [62]. Additionally, a patient with liver cirrhosis 
that progressed to hepatocellular carcinoma had antinuc- 
lear antibodies that recognized a nuclear protein putatively 
identified as U2AF65 [63]. Other splicing factors, most 
notably SFRS1 (ASF/SF2), are reported to be over- 
expressed in colon, thyroid, kidney, lung and breast cancer 
cells [64]. Other splicing factors shown to be 
over-expressed in colorectal cancer cells are hnRNP-F and -K, 
SPF45, and SRPK1 [64]. However, the present report is the 
first to describe correlation of increased expression or 
binding activity of U2AF65 in primary colorectal tumors 
with tumor stage, lymph node disease, metastasis and 
reduced overall survival. 

Why U2AF65 is over-expressed in colorectal tumor 
cells, and whether this over-expression is important to 
the development and/or progression of colorectal cancer 
or a passive effect of general gene deregulation are un- 
known. About 75% of sporadic colorectal cancers are 
characterized by a chromosomal instability (CIN) pheno- 
type. The most common reported chromosomal losses 
involve 5q (APC), 18q (DCC), and 17p (p53), while the 
most common gains involve 8q and 20q. The gene encoding 
U2AF65 (U2AF2) is located at chromosome 19q13.42. 
Chromosomal amplifications at 19q13.42 have been 
found in a rare embryonal tumor using array CGH and 
FISH [65,66]. Other groups have reported amplifications 
or aberrations at 19q13 in colorectal tumors, particularly 
in liver metastases compared to primary tumors 
[67], and in other solid tumors including pancreatic [68] 
and ovarian [69]. 

Regarding genomic instability, Vasquez and colleagues 
recently showed that both non-B DNA sequences and 
WRN helicase deficiency induce mutations characterized 
by single base changes, mostly at C-G base pairs, in an 
additive but not synergistic manner [70]. Because no synergy 
was observed, the authors concluded that WRN is unlikely 
to reduce mutation frequencies via a mechanism 
dependent on its cellular helicase activity (for example, 
on non-B DNA sequences). Their data do not 
directly support our present hypothesis, which resembles 
theirs: if one function of the WRN helicase 
were to resolve non-B (triplex and Z-DNA) structures, 
as observed in vitro, then mutation frequencies 
may be higher in WRN-deficient cells than in WRN wild-type 
cells because both the number and stability of such 
structures would be greater in WRN-deficient cells. 
However, they did verify that purified WRN protein was 
able to unwind the third purine-rich strand of a synthetic 
triplex in vitro. Although our data suggest a correlation 
between expression of the WRN helicase with triplex 
DNA-binding activity in both normal and tumor tissue 
extracts, defining a functional role and mechanism of 
non-B DNA unwinding activity by WRN helicase and 
G*G multiplex binding (for example, by U2AF65) will re- 
quire further study. 

Beta-catenin, as a transcription factor complexed with 
TCF4, is known to upregulate expression of many relevant 
proteins in colorectal cancer, such as c-myc, cyclin 
D1, LEF-1, CD44, and c-jun. Whether beta-catenin 
influences the expression of U2AF65 is unknown, but a 
search of transcription factor binding sites in the 
U2AF65 (U2AF2) gene promoter did not indicate any 
beta-catenin or TCF family transcription factor sites 
among the 55 high-scoring (>85%) sites we identified 
(Cold Spring Harbor Laboratory Mammalian Promoter 
Database http://rulai.cshl.edu/CSHLmpd2/; Transcription 
Factor Search http://www.cbrc.jp/research/db/TFSEARCH.html). 
Similarly, mining microarray expression 
studies revealed no reports describing U2AF65 (U2AF2) as 
a beta-catenin, TCF4, or Wnt target gene (NCBI GEO; R 
Nusse Wnt/Beta catenin targets list: www.stanford.edu/~rnusse/pathways/targets.html). 
The biological significance 
of the correlation of U2AF65 and beta-catenin expression 
in colorectal tumor tissues, such as whether beta-catenin as a 
transcription factor affects U2AF65 expression, or whether 
U2AF65 as a splicing factor affects the splicing or expression 
of beta-catenin, remains to be determined. 

Several studies have examined the interaction of beta- 
catenin with splicing factors and the role of beta-catenin 
in mRNA splicing. Researchers identified alternative spli- 
cing of SLC39A14, a divalent cation transporter, in colo- 
rectal tumors and found it to be regulated by the Wnt 
pathway, probably through regulation of splicing factor 
SRSF1 [71]. The beta-catenin/TCF4 pathway also modifies 
alternative splicing through modulation of expression of 
splicing factors SRp20 [72] and SF1 [73] and direct inter- 
action with FUS/TLS (translocated in liposarcoma) and 
various other RNA-binding proteins, including p54nrb 
[74]. Others have shown that beta-catenin regulates mul- 
tiple steps of RNA metabolism in colon cancer cells and 
may coordinate RNA metabolism [75]. 

Authors have also reported identification of truncated 
beta-catenin isoforms, mostly in colorectal cancer cells. 
In primary colorectal tumors, a relatively small proportion 
(7 of 58 examined) contained somatic interstitial deletions 
that included all or part of exon 3 of the beta-catenin gene, 
and RT-PCR analysis of 3 of the 7 tumors detected transcripts 
that lacked exon 3 in addition to the normal 
transcript [39]. Researchers also detected two novel 
beta-catenin mRNA splice variants in the SW480 colon cancer 
cell line and in primary colorectal tumors [38]. A truncated 
beta-catenin protein of 80 kDa was also detected in three 
colorectal metastases to the liver [40]. Several of these isoforms 
have truncations in the NH2-terminus of the protein 
that delete key serine and threonine residues that 
are phosphorylated by GSK-3 beta. Because this phosphorylation 
is important for proteasomal degradation, the deletions were hypothesized to 
stabilize the protein and have a dominant oncogenic effect 
[76]. Data from this and other studies lead us to speculate 
that U2AF65 could be binding to a multi-stranded nucleic 
acid structure such as R-loops, D-loops, or G-quartet 
mRNA in vivo that is mimicked by the purine triplex DNA 
probe in our study, and that overexpression or increased 
EMSA binding activity of U2AF65 in tumor tissues could 
cause deregulation of mRNA splicing and protein isoform 
expression, such as beta-catenin, that could contribute to 
colorectal cancer initiation and/or progression. 

Conclusions 

We found that increased triplex DNA-binding activity 
in colorectal tumor extracts in vitro is associated with 
WRN helicase expression, increased total beta-catenin 
expression, lymph node disease, metastasis, and 
reduced overall survival in patients with colorectal 
cancer. Multifunctional splicing factor U2AF65 was 
identified as the major triplex-binding protein in 
human tissues and cell lines. Increased expression of 
U2AF65 is also associated with expression of splicing 
factors PSF and p54nrb, a higher tumor stage, and 
increased truncation of beta-catenin in colorectal 
tumors. We believe that our results contribute to and 
generate interest in the growing fields of alternative 
non-B DNA structures and genomic instability, aber- 
rantly regulated splicing factors, mRNA splicing and 
protein isoforms related to cancer both as basic re- 
search objectives regarding the etiology of cancer and 
cancer diversity and as novel translational research in 
the search for promising prognostic, diagnostic and 
targeting tools. 



Additional file 6: histograms_proteins_groups. 

Additional file 7: Table S1. RPPA antibodies and Spearman correlation 
p values. 
extract (H) was used as a control in lanes 6 and 12. Purine triplex probe 
alone is in lane 1 and duplex probe alone is in lane 7. 

Figure S2a. Electrophoretic Mobility Shift Assay (EMSA) of Cytoplasmic and Nuclear 
total protein from cytoplasmic (cy) or nuclear (nuc) extracts from eight 
colorectal cancer cell lines. 1.25 µg HeLa cytoplasmic and nuclear extracts 
were used as positive (+) controls. Each reaction also contained 2 µg 
poly(dI-dC) carrier DNA. The purine triplex DNA probe alone is shown in 
lane 1. 

Figure S2b. Western blots showing expression of three candidate 
triplex DNA-binding proteins in eight colorectal cancer cell lines. Total 
protein (25 µg) from cytoplasmic (cy) and nuclear (nu) extracts from 
eight colorectal cancer cell lines was separated using 10% SDS-PAGE 
and electro-transferred to nitrocellulose membranes. Blots were 
incubated with antibodies against PSF, U2AF65, p54nrb, beta-catenin, 
and actin, then with the appropriate secondary antibody, and detected using 
chemiluminescence and autoradiography. 

Figure S3. Lack of a super-shifted H3 band in RKO nuclear extract by 
super-shift EMSA with antibodies against PSF and p54nrb. 33P-labeled 
triplex DNA (1 nM) was complexed with 1.5 µg total protein from RKO 
nuclear extracts (lanes 2-9). Lane 1, triplex DNA probe alone; lane 2, no 
antibody; lane 3, 400 ng anti-U2AF65 antibody MC3; lane 4, 1000 ng 
anti-U2AF65 antibody MC3; lane 5, 400 ng anti-PSF antibody; lane 6, 1000 
ng anti-PSF antibody; lane 7, 400 ng anti-p54nrb antibody; lane 8, 1000 
ng anti-p54nrb antibody; lane 9, mouse IgG antibody (negative control). 
Each reaction also contained 2 µg poly(dI-dC) carrier DNA. 

Figure S4. Quantitation of Protein Expression of PSF, U2AF65, p54nrb, and 
beta-catenin obtained from six colorectal cancer patients' tissue extracts. 
Autoradiographs from Western blots in Figure 6 were scanned, and 
protein expression bands were quantitated using NIH ImageJ. Protein 
expression was normalized by dividing by the samples' corresponding 
actin value and graphed using GraphPad. 

Figure S5. Beta-catenin Expression by Tumor type and Stage. Western blots 
using an anti-beta-catenin antibody to examine expression in patient 
extracts were performed as described for Figure 6. Beta-catenin expression 
values were normalized by dividing by the actin expression value in each 
extract, and plotted according to colon or rectum tumor stage using the R 
program. N cyto, cytoplasmic normal tissue extracts; N nuc, nuclear normal 
tissue extracts; T cyto, cytoplasmic tumor tissue extracts; T nuc, nuclear 
tumor tissue extracts. 